FastSpar: rapid and scalable correlation estimation for compositional data
Related resources
Correlation Analysis for Compositional Data
Compositional data need a special treatment prior to correlation analysis. In this paper we argue why standard transformations for compositional data are not suitable for computing correlations, and why the use of raw or log-transformed data is not meaningful either. As a solution, a procedure based on balances is outlined, leading to sensible correlation measures. The construction of the balances...
Scalable Data Correlation
The fast capacity growth of cheap storage presents an ever-escalating problem for forensic investigations as currently employed forensic technologies are not designed to scale to the degree necessary to meet the challenge. In this work, we present an approach which seeks to scale up the process of finding related digital artifacts across large data sets by employing an advanced version of our s...
Discriminant analysis for compositional data and robust parameter estimation
Compositional data, i.e. data including only relative information, need to be transformed prior to applying the standard discriminant analysis methods that are designed for the Euclidean space. Here it is investigated for linear, quadratic, and Fisher discriminant analysis, which of the transformations lead to invariance of the resulting discriminant rules. Moreover, it is shown that f...
CCLasso: correlation inference for compositional data through Lasso
Motivation: Direct analysis of microbial communities in the environment and human body has become more convenient and reliable owing to the advancements of high-throughput sequencing techniques for 16S rRNA gene profiling. Inferring the correlation relationship among members of microbial communities is of fundamental importance for genomic survey study. Traditional Pearson correlation analysis t...
Hierarchically Compositional Kernels for Scalable Nonparametric Learning
We propose a novel class of kernels to alleviate the high computational cost of large-scale nonparametric learning with kernel methods. The proposed kernel is defined based on a hierarchical partitioning of the underlying data domain, where the Nyström method (a globally low-rank approximation) is married with a locally lossless approximation in a hierarchical fashion. The kernel maintains (str...
Journal
Journal title: Bioinformatics
Year: 2018
ISSN: 1367-4803, 1460-2059
DOI: 10.1093/bioinformatics/bty734